bankruptcy prediction
ARCADIA: Scalable Causal Discovery for Corporate Bankruptcy Analysis Using Agentic AI
Maturo, Fabrizio, Riccio, Donato, Mazzitelli, Andrea, Bifulco, Giuseppe, Paolone, Francesco, Brezeanu, Iulia
Iteration 1 uses a broad, data-driven prior; subsequent iterations exploit memory to execute focused, theory-driven repairs, steadily converging on a causally defensible graph. This iterative loop is made explicit in Algorithm 1, while the statistics used during Evaluate are summarised in Table 2 and computed procedurally in Algorithm 2.
3.1. Causal Assumptions
Every proposed DAG must explicitly address the four core assumptions required for causal identification. First, regarding unobserved confounding, the agent must state which latent factors remain and how observed variables serve as proxies for these unobserved influences. Second, the positivity assumption requires that the agent argue no sub-population is locked into or out of the treatment, often demonstrated by reporting overlap in the propensity-score distribution across treatment groups.
- North America > United States (0.05)
- Europe > Italy > Lazio > Rome (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Banking & Finance (1.00)
- Law > Business Law (0.46)
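The propensity-score overlap report mentioned in the ARCADIA excerpt can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated by some model; the function names and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def common_support(treated: Sequence[float],
                   control: Sequence[float]) -> Tuple[float, float]:
    """Return the [low, high] range of scores covered by BOTH groups."""
    low = max(min(treated), min(control))
    high = min(max(treated), max(control))
    return low, high


def share_outside(scores: Sequence[float], low: float, high: float) -> float:
    """Fraction of units whose score falls outside the common support."""
    outside = sum(1 for s in scores if s < low or s > high)
    return outside / len(scores)


# Hypothetical propensity scores (assumption: estimated elsewhere).
treated_scores = [0.35, 0.50, 0.62, 0.80, 0.95]
control_scores = [0.05, 0.20, 0.40, 0.55, 0.70]

low, high = common_support(treated_scores, control_scores)
treated_off = share_outside(treated_scores, low, high)
control_off = share_outside(control_scores, low, high)
print(f"common support: [{low}, {high}]")
print(f"treated outside: {treated_off:.0%}, control outside: {control_off:.0%}")
```

A large share of units outside the common support would suggest that some sub-populations are effectively locked into or out of the treatment, which is the situation the positivity assumption rules out.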
Are Foundation Models Useful for Bankruptcy Prediction?
Kostrzewa, Marcin, Furman, Oleksii, Furman, Roman, Tomczak, Sebastian, Zięba, Maciej
Foundation models have shown promise across various financial applications, yet their effectiveness for corporate bankruptcy prediction remains systematically unevaluated against established methods. We study bankruptcy forecasting using Llama-3.3-70B-Instruct and TabPFN, evaluated on large, highly imbalanced datasets of over one million company records from the Visegrád Group. We provide the first systematic comparison of foundation models against classical machine learning baselines for this task. Our results show that models such as XGBoost and CatBoost consistently outperform foundation models across all prediction horizons. LLM-based approaches suffer from unreliable probability estimates, undermining their use in risk-sensitive financial settings. TabPFN, while competitive with simpler baselines, requires substantial computational resources with costs not justified by performance gains. These findings suggest that, despite their generality, current foundation models remain less effective than specialized methods for bankruptcy forecasting.
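One common way to quantify how reliable probability estimates are is to compute the Brier score and an expected calibration error (ECE). The sketch below is only an illustration under assumed toy data; the abstract does not say which calibration measures the authors used, and the helper names are hypothetical.

```python
from typing import List, Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def expected_calibration_error(y_true: Sequence[int],
                               y_prob: Sequence[float],
                               n_bins: int = 5) -> float:
    """Weighted average of |mean predicted prob - observed rate| per bin."""
    bins: List[list] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes to last bin
        bins[idx].append((y, p))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for _, p in members) / len(members)
        observed = sum(y for y, _ in members) / len(members)
        ece += len(members) / len(y_true) * abs(mean_prob - observed)
    return ece


# Hypothetical labels (1 = bankrupt) and two hypothetical sets of scores.
labels = [0, 0, 0, 0, 1]
base_rate_scores = [0.2] * 5                 # always predicts the base rate
overconfident_scores = [0.9, 0.8, 0.1, 0.2, 0.9]

for name, scores in [("base rate", base_rate_scores),
                     ("overconfident", overconfident_scores)]:
    print(name, brier_score(labels, scores),
          expected_calibration_error(labels, scores))
```

On imbalanced data such as bankruptcy records, a model can rank firms reasonably well and still score poorly on these measures, which is why calibration is usually checked separately from discrimination.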
Enhancing Bankruptcy Prediction of Banks through Advanced Machine Learning Techniques: An Innovative Approach and Analysis
Rustam, Zuherman, Hartini, Sri, Islam, Sardar M. N., Novkaniza, Fevi, Aszhari, Fiftitah R., Rifqi, Muhammad
Context: Financial system stability is determined by the condition of the banking system. A bank failure can destroy the stability of the financial system, as banks are subject to systemic risk, affecting not only individual banks but also segments or the entire financial system. Calculating the probability of a bank going bankrupt is one way to ensure the banking system is safe and sound. Existing literature and limitations: Statistical models, such as Altman's Z-Score, are among the common techniques for developing a bankruptcy prediction model. However, statistical methods rely on rigid and sometimes irrelevant assumptions, which can result in low forecast accuracy. New approaches are necessary. Objective of the research: Bankruptcy models are developed using machine learning techniques, such as logistic regression (LR), random forest (RF), and support vector machines (SVM). According to several studies, machine learning is also more accurate and effective than statistical methods for classifying and forecasting banking risk. Present Research: The commercial bank data are derived from the annual financial statements of 44 active banks and 21 bankrupt banks in Turkey from 1994 to 2004, and the rural bank data are derived from the quarterly financial reports of 43 active and 43 bankrupt rural banks in Indonesia between 2013 and 2019. Five rural banks in Indonesia have also been selected to demonstrate the feasibility of analysing bank bankruptcy trends. Findings and implications: The results of the research experiments show that RF can forecast data from commercial banks with a 90% accuracy rate. Furthermore, the three machine learning methods proposed accurately predict the likelihood of rural bank bankruptcy. Contribution and Conclusion: The proposed innovative machine learning approach helps to implement policies that reduce the costs of bankruptcy.
- Asia > Middle East > Republic of Türkiye (0.26)
- Europe > United Kingdom (0.04)
- North America > United States > New York (0.04)
- Banking & Finance > Financial Services (0.49)
- Information Technology > Security & Privacy (0.48)
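For readers unfamiliar with the Altman Z-Score named in the abstract above, the sketch below computes the original 1968 formulation for publicly traded manufacturing firms. This is an assumption for illustration only: the abstract does not state which variant the authors refer to, and the input ratios here are hypothetical.

```python
def altman_z(working_capital_to_assets: float,
             retained_earnings_to_assets: float,
             ebit_to_assets: float,
             market_equity_to_liabilities: float,
             sales_to_assets: float) -> float:
    """Original Altman (1968) Z-score for public manufacturing firms."""
    return (1.2 * working_capital_to_assets
            + 1.4 * retained_earnings_to_assets
            + 3.3 * ebit_to_assets
            + 0.6 * market_equity_to_liabilities
            + 1.0 * sales_to_assets)


def altman_zone(z: float) -> str:
    """Conventional cut-offs: below 1.81 distress, above 2.99 safe."""
    if z < 1.81:
        return "distress"
    if z > 2.99:
        return "safe"
    return "grey"


# Hypothetical ratios for a single firm.
z = altman_z(0.2, 0.3, 0.1, 1.0, 1.5)
print(round(z, 2), altman_zone(z))
```

The fixed coefficients and cut-offs are an example of the rigid assumptions the abstract criticises: they were fitted to one sample of firms and are not re-estimated for banks.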
Missing Data Imputation With Granular Semantics and AI-driven Pipeline for Bankruptcy Prediction
Chakraborty, Debarati, Ranjan, Ravi
This work focuses on designing a pipeline for the prediction of bankruptcy. Missing values, high-dimensional data, and highly class-imbalanced databases are the major challenges in this task. A new method for missing data imputation with granular semantics is introduced, drawing on the merits of granular computing. The missing values are predicted using the feature semantics and reliable observations in a low-dimensional granular space. Granules are formed around every missing entry from a few highly correlated features and the most reliable closest observations, preserving the context (relevance and reliability) of the database around the missing entries. An intergranular prediction is then carried out for the imputation within those contextual granules. That is, the contextual granules enable a small relevant fraction of the huge database to be used for imputation and remove the need to access the entire database repeatedly for each missing value. This method is then implemented and tested for the prediction of bankruptcy with the Polish Bankruptcy dataset. It provides an efficient solution for big and high-dimensional datasets even with large imputation rates. An AI-driven pipeline for bankruptcy prediction is then designed using the proposed granular semantic-based data filling method, followed by solutions to the issues of high dimensionality and high class imbalance in the dataset. The rest of the pipeline consists of feature selection with random forest for reducing dimensionality, data balancing with SMOTE, and prediction with six popular classifiers, including a deep NN. All methods defined here have been experimentally verified with suitable comparative studies and shown to be effective on all the data sets captured over the five years.
- Europe > United Kingdom > England > East Yorkshire > Hull (0.14)
- North America > United States (0.04)
- Asia > Singapore (0.04)
- Asia > India > Telangana > Hyderabad (0.04)
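The data-balancing step with SMOTE named in the pipeline above can be sketched in a few lines. This is a simplified, standard-library illustration of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' implementation; the function name, parameters, and toy points are hypothetical.

```python
import math
import random
from typing import List


def smote_like(minority: List[List[float]], n_new: int,
               k: int = 2, seed: int = 0) -> List[List[float]]:
    """Create n_new synthetic minority points by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point (excluding itself)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class (bankrupt) firms with two features each.
bankrupt = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_like(bankrupt, n_new=6)
print(len(new_points), new_points[0])
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.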
Using multimodal learning and deep generative models for corporate bankruptcy prediction
Mancisidor, Rogelio A., Aas, Kjersti
Textual data from financial filings, e.g., the Management's Discussion & Analysis (MDA) section in Form 10-K, has been used to improve the prediction accuracy of bankruptcy models. In practice, however, we cannot obtain the MDA section for all public companies. The two main reasons for the lack of MDA are: (i) not all companies are obliged to submit the MDA and (ii) technical problems arise when crawling and scraping the MDA section. This research introduces for the first time, to the best of our knowledge, the concept of multimodal learning in bankruptcy prediction models to solve the problem that for some companies we are unable to obtain the MDA text. We use the Conditional Multimodal Discriminative (CMMD) model to learn multimodal representations that embed information from accounting, market, and textual modalities. The CMMD model needs a sample with all data modalities for model training. At test time, the CMMD model only needs access to accounting and market modalities to generate multimodal representations, which are further used to make bankruptcy predictions. This makes bankruptcy prediction models that use textual data realistic and feasible, since accounting and market data are available for all companies, unlike textual data. The empirical results in this research show that the classification performance of our proposed methodology is superior to that of a large number of traditional classifier models. We also show that our proposed methodology overcomes a limitation of previous bankruptcy models using textual data, which can only make predictions for a small proportion of companies.
- Europe > Norway > Eastern Norway > Oslo (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Asia > Indonesia > Bali (0.04)
- Banking & Finance > Trading (1.00)
- Government > Regional Government > North America Government > United States Government (0.46)
Predicting municipalities in financial distress: a machine learning approach enhanced by domain expertise
Piermarini, Dario, Sudoso, Antonio M., Piccialli, Veronica
Financial distress of municipalities, although comparable to bankruptcy of private companies, has a far more serious impact on the well-being of communities. For this reason, it is essential to detect deficits as soon as possible. Predicting financial distress in municipalities can be a complex task, as it involves understanding a wide range of factors that can affect a municipality's financial health. In this paper, we evaluate machine learning models to predict financial distress in Italian municipalities. Accounting judiciary experts have specialized knowledge and experience in evaluating financial performance, and they use a range of indicators to make their assessments. By incorporating these indicators in the feature extraction process, we can ensure that the model takes into account a wide range of information relevant to the financial health of municipalities. The results of this study indicate that using machine learning models in combination with the knowledge of accounting judiciary experts can aid in the early detection of financial distress, leading to better outcomes for the communities.
- Europe > Italy (0.29)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > France (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.89)
A Comprehensive Survey on Enterprise Financial Risk Analysis from Big Data Perspective
Zhao, Yu, Du, Huaming, Li, Qing, Zhuang, Fuzhen, Liu, Ji, Kou, Gang
Enterprise financial risk analysis aims at predicting the future financial risk of enterprises. Due to its wide and significant application, enterprise financial risk analysis has always been a core research topic in the fields of Finance and Management. Driven by advances in computer science and artificial intelligence, enterprise risk analysis research is developing rapidly and making significant progress. Therefore, it is both necessary and challenging to comprehensively review the relevant studies. Although there are already some valuable and impressive surveys on enterprise risk analysis from the perspective of Finance and Management, these surveys introduce approaches in a relatively isolated way and lack recent advances in enterprise financial risk analysis. In contrast, this paper attempts to provide a systematic literature survey of enterprise risk analysis approaches from a Big Data perspective, reviewing more than 250 representative articles published between 1968 and 2023. To the best of our knowledge, this is the first and only survey of enterprise financial risk from a Big Data perspective. Specifically, this survey connects and systematizes the existing enterprise financial risk studies, i.e., it summarizes and interprets the problems, methods, and spotlights in a comprehensive way. In particular, we first introduce the issues of enterprise financial risks in terms of their types, granularity, intelligence, and evaluation metrics, and summarize the corresponding representative works. Then, we compare the analysis methods used to learn enterprise financial risk, and finally summarize the spotlights of the most representative works. Our goal is to clarify current cutting-edge research and its possible future directions to model enterprise risk, aiming to fully understand the mechanisms of enterprise risk generation and contagion.
- Asia > China (1.00)
- Europe (0.67)
- North America > United States > Wisconsin (0.28)
- Overview (1.00)
- Research Report > Experimental Study (0.68)
- Research Report > New Finding (0.46)
- Banking & Finance > Trading (1.00)
- Banking & Finance > Economy (1.00)
- Banking & Finance > Credit (0.97)
Fallen Angel Bonds Investment and Bankruptcy Predictions Using Manual Models and Automated Machine Learning
Mateika, Harrison, Jia, Juannan, Lillard, Linda, Cronbaugh, Noah, Shin, Will
The primary aim of this research was to find a model that best predicts which fallen angel bonds would rise back to investment grade and which would fall into bankruptcy. To that end, we sought a machine learning model that could predict bankruptcies. From the many machine learning models available, we chose four classification methods: logistic regression, KNN, SVM, and NN. We also used an automated method, Google Cloud's machine learning. The model comparisons showed that the models did not predict bankruptcies very well on the original data set, with the exception of Google Cloud's machine learning, which had a high precision score. However, the models did perform very well on our over-sampled, feature-selected data set. This could be due to the models being over-fitted to the over-sampled data (that is, they do not accurately predict data outside this data set). Therefore, we were not able to create a model that we are confident would predict bankruptcies. However, we were able to find value in this project in two key ways. The first is that Google Cloud's machine learning model either outperformed or performed on par with the other models in every metric and on every data set. The second is that feature selection did not reduce predictive power by much. This means that we can reduce the amount of data to collect for future experiments on predicting bankruptcies.
- Research Report > Experimental Study (0.55)
- Research Report > New Finding (0.35)
- Banking & Finance > Trading (1.00)
- Banking & Finance > Credit (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.36)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.30)
A Data-driven Case-based Reasoning in Bankruptcy Prediction
Li, Wei, Härdle, Wolfgang Karl, Lessmann, Stefan
There has been intensive research regarding machine learning models for predicting bankruptcy in recent years. However, the lack of interpretability limits their growth and practical implementation. This study proposes a data-driven explainable case-based reasoning (CBR) system for bankruptcy prediction. Empirical results from a comparative study show that the proposed approach outperforms existing alternative CBR systems and is competitive with state-of-the-art machine learning models. We also demonstrate that the asymmetrical feature similarity comparison mechanism in the proposed CBR system can effectively capture the asymmetrically distributed nature of financial attributes, such as a few companies controlling more cash than the majority, hence improving both the accuracy and explainability of predictions. In addition, we carefully examine the explainability of the CBR system in the decision-making process of bankruptcy prediction. While much research suggests a trade-off between improving prediction accuracy and explainability, our findings point to a prospective research avenue in which an explainable model that thoroughly incorporates data attributes by design can reconcile the dilemma.
- North America > United States > New York > New York County > New York City (0.14)
- Asia > Singapore (0.04)
- Europe > Germany > Berlin (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Banking & Finance (1.00)
- Government > Regional Government (0.67)
- Information Technology > Security & Privacy (0.46)
- Health & Medicine > Therapeutic Area (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
Risk Automatic Prediction for Social Economy Companies using Camels
Gallego-Mejia, Joseph, Martin-Vega, Daniela, Gonzalez, Fabio
Governments have to supervise and inspect social economy enterprises (SEEs). However, inspecting all SEEs is not possible due to the large number of SEEs and the generally low number of inspectors. We propose a prediction model based on a machine learning approach. The method was trained with the random forest algorithm on historical data provided by each SEE. Three consecutive periods of data were concatenated. The proposed method uses these periods as input data and predicts the risk of each SEE in the fourth period. The model achieved 76% overall accuracy. In addition, it obtained good accuracy in predicting the high risk of an SEE. We found that the legal nature and the variation of the past-due portfolio are good predictors of the future risk of an SEE. Thus, the risk of an SEE in a future period can be predicted by a supervised machine learning method. Predicting the high risk of an SEE improves the daily work of each inspector by allowing them to focus only on high-risk SEEs.
- North America > United States (0.28)
- Asia > India (0.04)
- South America > Colombia > Bogotá D.C. > Bogotá (0.04)
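The input construction described in the last abstract, where three consecutive periods are concatenated to predict the risk in the fourth, can be sketched as a sliding window. The data layout, names, and values below are hypothetical; the abstract does not describe the authors' actual feature set or code, and the random forest training step is omitted.

```python
from typing import Dict, List, Tuple

Period = Tuple[List[float], str]  # (features for one period, risk label)


def build_windows(history: Dict[str, List[Period]],
                  window: int = 3) -> Tuple[List[List[float]], List[str]]:
    """Concatenate `window` consecutive periods; target is the next period."""
    rows, targets = [], []
    for periods in history.values():
        for start in range(len(periods) - window):
            features: List[float] = []
            for period_features, _ in periods[start:start + window]:
                features.extend(period_features)
            rows.append(features)
            targets.append(periods[start + window][1])
    return rows, targets


# Hypothetical SEEs with two features per period, ordered in time.
history = {
    "see_a": [([0.1, 5.0], "low"), ([0.2, 5.5], "low"),
              ([0.4, 6.0], "medium"), ([0.7, 6.5], "high")],
    "see_b": [([0.0, 3.0], "low"), ([0.1, 3.1], "low"), ([0.1, 3.0], "low"),
              ([0.2, 3.2], "low"), ([0.1, 3.3], "low")],
}

X, y = build_windows(history)
print(len(X), X[0], y[0])
```

The resulting rows and targets could then be passed to any supervised classifier, such as the random forest named in the abstract.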